UDS-FIM: An Efficient Algorithm of Frequent Itemsets Mining over Uncertain Transaction Data Streams

Authors

  • Le Wang
  • Lin Feng
  • Mingfei Wu
Abstract

In this paper, we study the problem of finding frequent itemsets in uncertain data streams. To the best of our knowledge, existing algorithms cannot compress transaction itemsets into a tree as compact as the classical FP-Tree, so they need considerable time and memory to process the tree. To address this issue, we propose an algorithm, UDS-FIM, and a tree structure, UDS-Tree. First, UDS-FIM stores the probability values of each transaction in an array; second, it compresses each transaction into a UDS-Tree in the same manner as an FP-Tree (so the tree is as compact as an FP-Tree) and records the index of each transaction's probability values in the array at the corresponding tail-node; finally, it mines frequent itemsets from the UDS-Tree without an additional scan of the transactions. The experimental results show that UDS-FIM performs well in terms of runtime and memory consumption under different experimental conditions.
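
The storage idea in the abstract can be illustrated with a minimal Python sketch. It is based only on the abstract, not on the paper's actual pseudocode: the names `Node`, `UDSTreeSketch`, `prob_rows`, and `tail_rows` are hypothetical, the fixed item order is an assumption, and `expected_support` uses the common definition for uncertain data (sum over transactions of the product of item probabilities, assuming independent items), which the abstract itself does not state. Sliding-window maintenance and the mining step are omitted.

```python
class Node:
    """Prefix-tree node; the fields are illustrative, not taken from the paper."""

    def __init__(self, item, parent=None):
        self.item = item
        self.parent = parent
        self.children = {}    # item -> Node
        self.count = 0        # transactions passing through this node
        self.tail_rows = []   # indices into the probability array (tail-nodes only)


class UDSTreeSketch:
    def __init__(self, item_order):
        # Assumption: a fixed global item order, as in FP-Tree construction.
        self.rank = {item: i for i, item in enumerate(item_order)}
        self.root = Node(None)
        self.prob_rows = []   # one row of probabilities per transaction

    def insert(self, transaction):
        """Insert a transaction given as a dict: item -> existence probability."""
        items = sorted(transaction, key=self.rank.__getitem__)
        if not items:
            return
        # Probabilities go into the array, in the same order as the tree path.
        self.prob_rows.append([transaction[i] for i in items])
        row = len(self.prob_rows) - 1
        node = self.root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
        # Only the tail-node remembers where the probabilities live.
        node.tail_rows.append(row)

    def node_count(self):
        """Number of item nodes, i.e. the size of the plain prefix tree."""
        stack, n = list(self.root.children.values()), 0
        while stack:
            node = stack.pop()
            n += 1
            stack.extend(node.children.values())
        return n

    def expected_support(self, itemset):
        """Expected support computed from the tree and array, without rescanning input."""
        wanted = set(itemset)
        total = 0.0
        stack = list(self.root.children.values())
        while stack:
            node = stack.pop()
            stack.extend(node.children.values())
            if not node.tail_rows:
                continue
            # Rebuild the root-to-tail path; position k matches column k of the row.
            path = []
            cur = node
            while cur.parent is not None:
                path.append(cur.item)
                cur = cur.parent
            path.reverse()
            cols = [k for k, item in enumerate(path) if item in wanted]
            if len(cols) != len(wanted):
                continue  # this transaction does not contain the whole itemset
            for row in node.tail_rows:
                p = 1.0
                for k in cols:
                    p *= self.prob_rows[row][k]
                total += p
        return total


tree = UDSTreeSketch(["a", "b", "c"])
tree.insert({"a": 0.8, "b": 0.5, "c": 0.9})
tree.insert({"a": 0.6, "b": 0.4})
tree.insert({"a": 0.5, "c": 0.2})
```

Because probabilities are kept out of the nodes, the three sample transactions share the prefix `a` (and two of them share `a, b`) exactly as they would in an ordinary FP-Tree, while `tree.expected_support({"a", "b"})` can still be evaluated from the tail-node indices alone.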

Similar resources

Incremental updates of closed frequent itemsets over continuous data streams

Online mining of closed frequent itemsets over streaming data is one of the most important issues in mining data streams. In this paper, we propose an efficient one-pass algorithm, NewMoment, to maintain the set of closed frequent itemsets in data streams with a transaction-sensitive sliding window. An effective bit-sequence representation of items is used in the proposed algorithm to reduce the...

An Efficient Algorithm to Mine Online Data Streams

Mining frequent closed itemsets provides complete and condensed information for non-redundant association rules generation. Extensive studies have been done on mining frequent closed itemsets, but they are mainly intended for traditional transaction databases and thus do not take data stream characteristics into consideration. In this paper, we propose a novel approach for mining closed frequen...

An Efficient Algorithm for Maintaining Frequent Closed Itemsets over Data Stream

Data mining refers to the process of revealing unknown and potentially useful information from a large database. Frequent itemset mining is one of the foundational problems in data mining: discovering the sets of products that customers frequently purchase together in a transaction database. However, there may be a large number of patterns generated from the database, and many of the...

Mining frequent itemsets over data streams using efficient window sliding techniques

Online mining of frequent itemsets over a stream sliding window is one of the most important problems in stream data mining, with broad applications. It is also a difficult problem because streaming data has some challenging characteristics, such as unknown or unbounded size, a possibly very fast arrival rate, the inability to backtrack over previously arrived transactions, and a lack of system co...

Mining Frequent Itemsets Over Arbitrary Time Intervals in Data Streams

Mining frequent itemsets over a stream of transactions presents difficult new challenges over traditional mining in static transaction databases. Stream transactions can only be looked at once and streams have a much richer frequent itemset structure due to their inherent temporal nature. We examine a novel data structure, an FP-stream, for maintaining information about itemset frequency historie...

Journal:
  • JSW

Volume 9  Issue 

Pages  -

Publication date: 2014